(57) ABSTRACT

In a computer system, data records stored in nonvolatile 
memory are read into a volatile memory and operated on in 
a sorting operation. A tournament-type sort is applied, with
the tree size dynamically reconfigured within the volatile 
memory as a function of the number of data records to be 
sorted. The memory space occupied is reduced by the 
reconfigured tree and sort speed is augmented. 
METHOD FOR SORTING AND STORING DATA EMPLOYING DYNAMIC SORT TREE
RECONFIGURATION IN VOLATILE MEMORY

This application is a continuation of application Ser. No.
08/236,513, filed May 2, 1994, now U.S. Pat. No. 5,619,693.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to methods for sorting,
storing and retrieving sorted data in computer systems. In
particular, the present invention relates to sorting and storing
[illegible in source].

2. Description of the Prior Art and Related Information

Handling large databases is a significant part of many
applications of computer systems. For example, in a wide
range of applications from financial services to retail
operations and services, the handling of large databases in an
efficient manner is a key requirement of the computer
systems employed in these industries. Frequently, the
databases of interest include a large number of separate data
records, which data records need to be sorted in a desired
order for efficient handling or searching. For example, such
data records could include the pertinent information on
employees in a corporation or account holders in a financial
institution.

Such data records are typically stored in a high capacity
nonvolatile storage medium such as disk drives associated
with the computer system. As new data records are added,
however, or upon initial creation of the database or database
subset for storage, the sorting of the records into desired
order is performed. This sorting is performed in the volatile
working memory of the computer system which typically
has a more limited capacity than the nonvolatile memory,
which capacity may be needed for a variety of tasks other
than the sorting of the data records.

Therefore, it is desirable to sort data records in the volatile
memory of the computer system in as rapid and efficient a
manner as possible. It is further desired to minimize the
amount of input/output (I/O) between the nonvolatile
storage medium and the volatile memory due to the relatively
slow nature of I/O operations relative to the operational
speed of the computer system.

One highly efficient sorting technique which has been
employed in the art is the so-called tournament sort. This
approach is described, for example, in Knuth, Donald E.,
The Art of Computer Programming, Volume 3: Sorting and
Searching, Section 5.4.1, pages 251-266, Addison-Wesley
Publishing Company (1973). In this approach to sorting data
records, a sort tree having a number of nodes configured in
a hierarchical tree structure is first created in the working
memory of the computer system. Data records to be sorted
are inserted into the bottom exterior nodes, or leaf nodes, of
the sort tree, and the data records are compared up the tree
in a tournament compare fashion until the "winners" emerge
at the top of the tree in sorted order.

Prior to introducing the data records to be sorted into the
sorting tree, however, the sorting tree first must be
initialized. This initialization process involves introducing
predetermined values into the tree structure which values will
always win in any comparison with real data values. For
example, such initialization values may take the form of
negative infinity (-∞) or positive infinity (+∞), for
ascending and descending sorts, respectively. These initialization
values ensure that the real data records to be sorted move
through the tree in the correct order. An initialization of the
sort tree further requires that a "loser attribute" be
determined for the interior nodes of the sorting tree. That is, since
the initialization values loaded into the sorting tree all have
the same nominal value, the initial losers and winners which
move up the tree must be determined arbitrarily at the outset;
that is, during initialization of the sort tree.

Although the tournament sort utilizing an initialized sort
tree as described above theoretically has the desired
characteristics [illegible in source], inefficiencies
are encountered when the number of records to be
sorted [illegible in source]
that an unknown number of data records be sorted in a single
sort, the largest sort tree which can be accommodated by the
volatile memory of the computer system may be selected.
Such a large sort tree will have considerable computer
system time overhead associated with initializing the tree,
however. Additionally, after initialization and before the first
sorted data records are read out of the tree, all the
initialization values must first be read out since such values are
always "winners" relative to the real data records. Therefore,
at least a corresponding number of comparison steps will be
required to read out all the initialization values from the sort
prior to getting actual sorted data records. Also, each
subsequent data record sorted must be compared up the entire
height of the tree, which height is log N, where N is the
number of exterior nodes. If a relatively small set of data
records is actually to be sorted, it will be appreciated that
creation of a large sort tree involves a considerable amount
of wasted computer time and uses an unnecessarily large
part of the volatile memory.

If a relatively small sort tree is selected, equally small sets
of data records to be sorted will be sorted in a close to
optimal manner. However, sets of data records which exceed
the sort tree size will encounter inefficiencies associated
with performing the sort in two or more separate runs
followed by merging sorts. More specifically, undesirable
I/O overhead may be associated with reading and writing
data records to and from main nonvolatile storage or scratch
files during the separate runs through the sort tree. Also,
initializing the small sort tree multiple times followed by
one or more merge sorts will inevitably waste computer time
as compared to a single sort.

Accordingly, it will be appreciated that the user of the
computer system is faced with a "Catch-22" when
undertaking a sort of an unknown number of data records. Choice
of a tree size which is either too large or too small will
inevitably involve inefficiencies and wasted computer time
which could otherwise be devoted to sorting. Such wasted
time and inefficient use of working memory may be very
significant where large databases are involved or where a
large number of separate sorts are required.

Accordingly, it will be appreciated that a need exists for
an improved method for sorting unknown quantities of
database records. It will further be appreciated that such a
method is needed which can optimize the use of available
volatile memory and which can minimize the I/O overhead
associated with transfers between nonvolatile and volatile
memory.

SUMMARY OF THE INVENTION

The present invention provides a method for optimizing
volatile memory usage and minimizing sort time in sorting
unknown or variable numbers of database records.

In accordance with the present invention, data records to
be sorted are read into volatile memory and data record
identifiers, including a sort key and a pointer to a specific
volatile memory location, are created for each data record.
A sort tree having interior and exterior nodes hierarchically
arranged is then created in volatile memory and initialized in
a predetermined ordered fashion. The nominal sort tree size
may be selected by the user or be predetermined, e.g. as the
maximum size sort tree compatible with the constraints of
the available volatile memory space. Then, data record
identifiers, including the key and pointer, are introduced into
the tree in an order which moves across the exterior nodes
of the tree rather than randomly populating the exterior
nodes. The sort tree is dynamically altered, during or after
introduction of the data record identifiers into the tree, to
optimize the effective size of the sort tree. After the data
record identifiers have all been input and the tree is
dynamically reconfigured, the sort proceeds, with the keys being
compared up the tree and the keys and pointers shifted in
volatile memory into the sorted order. The sorted pointers
are then used to read the data records from volatile memory
back into nonvolatile memory in sorted order.

Since the sort tree is dynamically reconfigured to an
optimized effective size, selecting the maximum nominal
size of the sort tree has the advantage of minimizing the
number of times the sort tree will need to be initialized,
as well as minimizing inefficiencies attendant to
performing sorts on separate runs and merging the results of
those runs. In addition, I/O overhead may be reduced by
minimizing the number of times that data must be read from
and written to nonvolatile memory during the separate runs
through separate sort trees.

In a preferred embodiment, the sort tree is dynamically
reconfigured as it is created as data record identifiers are read
in. That is, the sort tree is grown as necessary to
accommodate data record identifiers introduced into the nascent tree.
The sort tree employs a movable root node which is always
set as low as possible in the sort tree. The root node is moved
upwards as needed when data records are added. After the
dynamically created and initialized sort tree is completed
and all data record identifiers have been loaded, the data
record key values are sorted using a compare rule in which
a key value at a lower level in the sort tree hierarchy will
leapfrog key values of equal value when they are compared.

In an alternative embodiment, a sort tree is completely
initialized and data record identifiers are then read into the
exterior nodes of the sort tree in the above-described ordered
manner. Once all data values have been loaded, the sort tree
is dynamically reduced to a more optimal size. One
preferred reducing operation is to dynamically truncate, or
"prune," the tree by eliminating unused exterior nodes and
corresponding interior nodes. Data sorting may then proceed
in the reduced tree using the above-noted compare rule. In
an alternative embodiment all unused nodes are changed to
a value corresponding to a predetermined loser value; i.e. a
value which will lose all compares. Those nodes associated
with dynamically changed loser values then become a
dormant background of the sort tree since these values do
not advance during compares. This effectively reduces the
size of the sort tree. This approach may be combined with
the pruning approach where sort consistency considerations
prevent pruning all unused nodes from the tree. By reducing
the size of the sort tree after initialization, the number of
compares required to eliminate initialization values and to
remove sorted data identifiers from the tree is reduced.
Therefore the present invention provides a method for
sorting and storing database records using volatile and
nonvolatile memory in which the size of the sort tree may be
effectively reduced automatically.

Further features and advantages of the present invention 
will be appreciated from review of the following detailed 
description of the invention. 

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a system block diagram in accordance with the 
present invention. 

FIG. 2 is a diagram of an exemplary sort tree in accor- 
dance with the present invention. 

FIG. 3 is a diagram of an initialized nine exterior node sort
tree in accordance with the present invention.

FIG. 4 is a diagram of an eight exterior node sort tree with 
three data record identifiers inserted in accordance with the 
present invention. 

FIG. 5 is a diagram of a pruned sort tree resulting from the 
sort tree of FIG. 4 in accordance with the present invention. 

FIG. 6 is a diagram of an eight exterior node sort tree with 
four data record identifiers inserted in accordance with the 
present invention.

FIG. 7 is a diagram of an eight exterior node sort tree with 
four data record identifiers inserted and with RUN 2 values 
inserted in accordance with the present invention. 

FIG. 8 is a diagram of an eight exterior node sort tree with 
the left branch initialized in accordance with the present 
invention. 

FIG. 9 is a diagram of a portion of a sort tree with the left 
branch initialized and two data record identifiers input in 
accordance with the present invention. 

FIG. 10 is a diagram of a sort tree in accordance with
the present invention with interior nodes initialized and three
data record identifiers input.

FIG. 11a is an illustration of a transportable floppy disk
upon which implementing code is written in accordance 
with the present invention. 

FIG. 11b is an illustration of a transportable computer tape
upon which implementing code is written in accordance
with the present invention.

FIG. 11c is an illustration of a transportable optical disk 
upon which implementing code is written in accordance 
with the present invention.

FIG. 12 is a flow diagram associated with Appendix 1.2 
in accordance with the present invention. 

FIG. 13 is a flow diagram associated with Appendix 1.3 
in accordance with the present invention. 

FIG. 14 is a flow diagram associated with Appendix 1.4 
in accordance with the present invention. 

DETAILED DESCRIPTION OF THE 
INVENTION 

In accordance with the present invention, data is sorted in 
a data processing system. As illustrated in FIG. 1, a data 
processing system 10 may be used. Typical data processing 
systems which may be used include mainframe computers, 
workstations or even personal computers. Also, multiple 
systems coupled in a network, with data records shared 
between systems on the network may be employed. 
Furthermore, the data processing system may include mul- 
tiple subsystems operating in a fault tolerant manner or may 
include such subsystems operating in a parallel processing 
environment where portions of a given sort task are
allocated to different processors. Also, such data processing
system or systems may effectively employ the present
invention when utilizing a variety of operating systems and
programming languages.

As illustrated in FIG. 1, typical data processing system 10
includes a central processing unit 20 ("CPU"). The CPU is
connected through a bus 30 to, inter alia, volatile memory 40
(also called RAM memory), non-volatile memory 50 (such
as disk drives, CD-ROMs and magnetic tapes), an input
means 60, such as a keyboard, and a removable media drive
65 such as a floppy disk drive, CD-ROM drive, CD-WORM
drive or tape drive.

It is desired to sort data contained in a database. The
database may be stored either in the RAM memory 40 or the
non-volatile memory 50, but in the preferred embodiment, it
is stored in the non-volatile memory 50. This preference is
based on practical necessity. Large databases of the type
handled in many computer systems will require high
capacity storage devices. These databases typically will be stored
in high capacity nonvolatile storage systems. However, in
some applications it also may be desired to store database
records in a volatile memory.

Various types or categories of database records and
information may be stored and sorted. For example, in a financial
services application a database may store names, associated
addresses and account numbers. In this example, each data
record has associated data items. For example, credit card
account number XX may have associated name NN and
address AA. The records may be sorted based on any of
these data items. The data item on which the sort is based is
referred to herein as a sorting key. Thus, if it is desired to sort
the records in ascending order based on one of the data
types, such as credit card numbers, then the credit card
number is referred to as the sort key. It also may be desired
to keep track of the associated data items, such as the names
and addresses associated with each credit card number.

To initiate a sort, data records are read from non-volatile
memory 50 into volatile memory 40. A segment within
volatile memory 40 may preferably be allocated to the sort.
A sort key is then selected (e.g. account number) and a data
record identifier, including the key value and a pointer, is
then created for each data record. The pointer contains a
logical memory address in the volatile memory locating the
associated data record.

A sort tree is also created in the allocated segment of
volatile memory 40. The sort tree includes exterior nodes 70,
interior nodes 80 and a root node 85, hierarchically
arranged. An exemplary sort tree 90 is shown in FIG. 2. A
location in memory 40 is allocated to each node as the tree
is created.

Any size sort tree within the constraints of the allocated
segment of volatile memory 40 may be selected. In a
preferred embodiment, the maximum tree size is selected,
such that as much as possible of the available space of the
RAM memory 40 is occupied. This may be done without
user input, automatically allocating all the available memory
in the allocated segment for the sort tree. Alternatively, the
user may select a sort tree size, based upon some estimate of
the maximum likely number of data records to be sorted.

The sort tree is initialized by filling the tree with
initialization values in a specific order. These initialization values
are set so that they always will win in a comparison
with key values that are being sorted. For example, if an
ascending sort of numbers is desired, a suitable initialization
value will be negative infinity (-∞). The -∞ value will be
less than any key value with which it is compared; thus, it
will win in an ascending sort.

The sort tree preferably is filled in accordance with a
sorting order set forth in the pseudo-source code annexed in
the Appendices. Appendix 1.1 is pseudo-source code
corresponding to the first embodiment discussed below. Appendix
1.1 is associated with the flow diagram in FIG. 12 which
illustrates the steps in the pseudo-source code. Appendix 1.2
is pseudo-source code corresponding to the second
embodiment discussed below. Appendix 1.2 is associated with the
flow diagram in FIG. 13 which illustrates the steps in the
pseudo-source code. Appendix 1.3 is pseudo-source code
corresponding to the third embodiment discussed below.
Appendix 1.3 is associated with the flow diagram in FIG. 14
which illustrates the steps in the pseudo-source code. Each
of Appendices 1.1-1.3 refers to computer routines in
Appendix 1.4.

The code in the Appendices is not specific to any
particular computer language; it can be written in any
computer language, such as C, C++, Assembler, Cobol or
Fortran. The code, including compiled or binary versions,
may be stored in non-volatile memory 50, or on a removable
medium received by removable media drive 65. Likewise, the
code (as well as compiled or binary versions) preferably
may be implemented on transportable media, such as
floppy disks, magnetic tape or optical disks, as illustrated in
FIGS. 11a, 11b and 11c respectively. As is known to those
skilled in the art, the code may be formed upon the
transportable media as magnetic flux reversals, or in the case of
optical disks, in the form of changes in optical reflectivity of
the medium.

A nine exterior node tree is illustrated in FIG. 3. The
exterior nodes 70 are each given a node number shown in
FIG. 3 adjacent the upper left corner of each node. This
numbering scheme is repeated in succeeding figures. The
nine exterior node numbers are in ascending order, from left
to right, starting with "0" and ending with the ninth node,
numbered "8". Although the nodes are depicted as being
numbered from left to right in the figures, any numbering
scheme may be used, such as from right to left. An ordered
loading of the exterior nodes is preferred. Specifically, the
left most node should be loaded first, and each adjacent
unpopulated node must then be loaded after its immediate
neighbor is loaded. This may be accomplished by loading from
left to right in the depiction of FIG. 3 or in accordance with
the techniques in the Appendices. Each exterior node 70 is
populated with a winner value, namely -∞ in the illustrated
example. Of course other values may be winners in different
sorts. For example, in a descending sort, +∞ would be a
winner value. In any event, in any sort, any RUN 0 value
will be deemed a winner over any RUN 1 value.

The interior nodes 80 are each populated with Loser
Attribute values and RUN values. In the figures, the Loser
Attribute and RUN values are depicted separated by a colon
(":"), although any depiction also may be used. For example,
interior node 4 is populated with a Loser Attribute of 2 and
a RUN number of 0. The Loser Attributes correspond to the
exterior node with that number. Thus, the Loser Attribute of
2 corresponds to exterior node 2. The Loser Attributes of the
interior nodes 80 are set so that input or data values are filled
in left to right order in the example illustrated.

Data values are read into the exterior nodes of the tree.
Data records stored in the non-volatile memory 50 are
accessed using CPU 20 and data bus 30. Preferably the data
records are transferred to the RAM memory 40 as needed
and read into the exterior nodes of the tree. Preferably each
data item has a data value or data identifier and an associated
pointer which points to associated data stored in the database
in the non-volatile memory 50.
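
The key-and-pointer identifiers and the tournament compare described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the pseudo-source code of the Appendices: it uses a plain winner tree instead of Loser Attributes and RUN values, it treats a list index as the "pointer", and the names tournament_sort, better and EMPTY are hypothetical.

```python
EMPTY = None  # assumption: marks an exterior node that holds no identifier


def better(a, b):
    """Return the winner of two identifiers in an ascending sort."""
    if a is EMPTY:
        return b
    if b is EMPTY:
        return a
    return a if a <= b else b


def tournament_sort(records, key):
    """Return records in ascending key order using a winner tree.

    Each identifier is (sort key, pointer); the pointer is the record's
    index in the in-memory list, standing in for a memory address.
    """
    idents = [(key(r), i) for i, r in enumerate(records)]
    n = len(idents)
    if n == 0:
        return []
    size = 1
    while size < n:          # round the exterior node count up to a power of two
        size *= 2
    # tree[size + j] is exterior node j; tree[1] is the top of the tree.
    tree = [EMPTY] * (2 * size)
    tree[size:size + n] = idents
    for i in range(size - 1, 0, -1):
        tree[i] = better(tree[2 * i], tree[2 * i + 1])

    out = []
    for _ in range(n):
        _, ptr = tree[1]                 # current overall winner
        out.append(records[ptr])         # follow the pointer to the record
        pos = size + ptr                 # exterior node that held the winner
        tree[pos] = EMPTY
        pos //= 2
        while pos >= 1:                  # replay compares up the tree
            tree[pos] = better(tree[2 * pos], tree[2 * pos + 1])
            pos //= 2
    return out
```

Each extraction replays only the compares on one path from an exterior node to the top, which is why the per-record cost grows with the height of the tree.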
Compares between values are conducted as in the typical
tournament-type scheme. Namely, the winners of a compare
in lower nodes are compared with the current value of the
nodes above. However, a different compare rule is applied.
Specifically, the data values at a lower level in the sort tree
hierarchy are ordered to leap past data values of equal value
when they are compared. Likewise, the winning data values
are ordered to leap past the losing values in compares. For
example, in an ascending sort, a data value will leap past
+∞ because it is numerically less and accordingly is a winner
in an ascending sort.
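
A minimal sketch of this compare rule for an ascending sort follows. It assumes each value carries a RUN number alongside its key and that a lower RUN number wins outright, consistent with the statement that any RUN 0 value wins over any RUN 1 value; the names Entry and challenger_wins are hypothetical and do not come from the Appendices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    run: int   # 0 = initialization value, 1 = real data value
    key: str   # sort key


def challenger_wins(challenger: Entry, incumbent: Entry) -> bool:
    """Decide whether the value rising from a lower level leaps past
    the value currently held in the node above (ascending sort)."""
    if challenger.run != incumbent.run:
        # A lower RUN number always wins, whatever the keys are.
        return challenger.run < incumbent.run
    # Same RUN: the smaller key wins, and on equal keys the value coming
    # from the lower level leaps past the one already stored above.
    return challenger.key <= incumbent.key
```

The `<=` in the last line is what encodes the tie rule: a strict `<` would leave equal keys where they are instead of letting the rising value pass.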

In a first, preferred, embodiment for optimizing data
processing performance of a sort tree, the data values are
read into a sort tree which has been initialized as discussed
above. FIG. 12 illustrates steps for loading the data. FIG. 4
illustrates a sort tree into which data values have been input.
An eight exterior node tree is illustrated. The three input data
values ("aa", "cc", and "bb") are in exterior nodes 0, 1
and 2 respectively.

Once the total number of data items has been determined,
such as by inputting all the data items into the sort tree, the
tree is pruned in order to eliminate unused nodes. Typically,
an end-of-file (EOF) is detected in the input data stream
immediately after the last data item is reached. Once the end
of the data stream is detected, the pruning process may
commence. Specifically, the subtrees consisting solely of
interior nodes containing RUN values of 0 (corresponding to
RUN 0) are pruned. In the FIG. 4 illustration, interior nodes
1, 3, 6 and 7 are pointing to RUN 0 initialization values. To
accomplish the pruning, the node highest in the tree pointing
to an actual value (i.e. a RUN 1 value) is redefined to point
to the root node of the tree. In FIG. 4, that highest node is
interior node 2.

The resulting pruned tree consists only of the root node 
and the interior nodes pointing to actual values (in FIG. 4, 
interior nodes 0, 2, 4 and 5) along with the corresponding 
exterior nodes (in FIG. 4, exterior nodes 0 through 3). The 
pruned tree resulting from the example illustrated in FIG. 4 
is illustrated in FIG. 5. The top interior node in the hierarchy
points to the root node, since it is the highest interior node 
pointing to an actual value. In FIG. 5, interior node 2 is the 
top interior node in the hierarchy. The sort then continues 
with the new tree structure. 
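
The effect of pruning on tree size can be approximated with a short calculation. The sketch below assumes a tree whose exterior node count is a power of two and whose records were loaded left to right, so the smallest subtree still holding every record is found by repeated halving. It ignores the node numbering of the figures and any node an EOF value might occupy, and pruned_leaf_count is a hypothetical name.

```python
import math


def pruned_leaf_count(total_leaves: int, records: int) -> int:
    """Exterior nodes left after discarding halves that hold no records.

    Assumes total_leaves is a power of two and records occupy the
    left-most exterior nodes.
    """
    leaves = total_leaves
    while leaves > 1 and leaves // 2 >= records:
        leaves //= 2     # the right half holds only initialization values
    return leaves


# Eight exterior nodes with three records, as in the FIG. 4 example:
leaves = pruned_leaf_count(8, 3)
levels = int(math.log2(leaves))   # compares needed per record in the pruned tree
```

For the FIG. 4 case this gives four exterior nodes, matching exterior nodes 0 through 3 in the pruned tree, and two levels of compares instead of three.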

In a second embodiment for optimizing the performance
of a sort tree, which is adapted to be used in combination
with the first embodiment, all the data values are read into
a sort tree which has been initialized as discussed above.
FIG. 13 illustrates steps for loading the data. The sort is run
until an end of file (EOF) is detected. Then the RUN 0
initialization values (i.e. -∞ in the examples discussed
above) are bypassed, thereby achieving access to the data
values faster. This is accomplished by replacing all RUN 0
initialization values with RUN 2 loser values. In one
embodiment all of the nodes of the tree are examined to
determine whether they correspond to RUN 0 initialization
values; then RUN 2 values are substituted. The RUN 2 loser
values are selected to be losers, namely, they will lose in any
comparison with a data value that is read in. One possibility
is to select the RUN 2 loser values to correspond to the value
assigned to the first EOF detected. For example, in an
ascending sort, +∞ will be a loser value, so +∞ is substituted
for all RUN 2 values. In effect, the unused exterior nodes
become dormant background of the sort tree. Because of the
compare rule in this invention, they will be bypassed by any
data values in any compare operation.

FIGS. 6 and 7 illustrate this embodiment. FIG. 6
illustrates an eight exterior node sort tree into which four data
values have been input. The data values ("aa", "cc", "bb"
and "dd") are in exterior nodes 0, 1, 2 and 3 respectively.
Once all data items have been input into the sort tree, an
end-of-file (EOF) indicator typically will be detected and the
EOF value (a loser) may be input into the next exterior node
(i.e. node 4 in FIG. 6). The loser value in the ascending sort
illustrated will be +∞. Once this first EOF value is detected,
all RUN 0 initialization values are changed to RUN 2 loser
values. Specifically, the sort tree set forth in FIG. 7 results
from the sort tree in FIG. 6. In each of the available exterior
nodes (numbers 4, 5, 6 and 7), loser values replace the RUN
0 initialization values. In the example shown in FIG. 7, the
loser values are +∞. In addition, the interior nodes populated
with RUN 0 initialization values are each changed to
indicate RUN 2 loser values. In FIG. 7 this is shown in interior
nodes 3, 6 and 7 as well as root node 0. For example, in
interior node 6, the value is changed from "5:0" to "5:2".
Then, when the tournament's main loop is executed, after
the input stream has been exhausted, a compare of the new
RUN 2 value stored in node 4 is made against the value "aa"
in node 1. Node 1 wins because "aa" is compared with +∞.
Under the comparison rule of the present invention, "aa"
wins in an ascending sort and values are removed from the
left side of the sort tree. This in effect disables the right side
of the sort tree which is populated only by RUN 2 values.
This results in an effectively pruned tree consisting of only
the left side, which is populated by the RUN 1 data values.

In a third, preferred, embodiment, the sort tree is kept at
a minimum size throughout a sort. Only those portions of the
sort tree that are actually utilized for sorting the data are
initialized along with the root node and the left most full
branch of the tree; i.e. the branch from the root node to the
left most exterior node of the tree. Thus, rather than starting
out with a sort tree initialized with RUN 0 initialization
values, the sort tree starts out only partially initialized. First,
the left most full branch of the tree is initialized, then the
lowest interior node on this branch is set to point at the root
node. All other nodes are not initialized. As illustrated in
FIG. 8, instead of initializing the entire sort tree, only
interior nodes 0, 1, 2, 4 and 8 and the exterior nodes they
point at are initialized with winner values. The root node 85
is set above the lowest interior node in the branch, namely
interior node 8.

Then data items may be input. As the initialized portions
are filled, additional branches may be initialized as needed.
Preferably, right before an actual value is inserted into an
exterior node, the tree is restructured so a new node points
at the root node, if the tree needs to be expanded to
accommodate the new value. The tree is expanded if there is
only one empty initialized exterior node left in the current
tree. The exterior and interior nodes needed for the expanded
tree defined by the new root node are initialized and the node
above the current node pointing at the root node is set to
point at the root node. FIG. 14 illustrates steps for loading
data and setting (i.e. updating) which node will point to the
root node.

A preferred method to determine whether a new node
needs to point at the root node is based on ascertaining the
Loser Value of the node above the interior node pointing at
the root node. That Loser Value corresponds to the number
of the exterior node that will be used to receive a data item
after the current sort tree is filled. For example, referring to
FIG. 3, if the current sort tree descends from interior node
4, then the node above interior node 4 is interior node 2. The
Loser Value of interior node 2 is "3". Thus, the number of
the exterior node that will be used to receive a data item
when the current sort tree is filled will be exterior node 3,
which corresponds to the Loser Value of interior node 2.
Since exterior node 3 is outside the current sort tree (which
descends from interior node 4), an additional branch
encompassing exterior node 3 will be initialized.
Specifically, the highest interior node will move to interior
node 2. All nodes ascending from exterior node 3 to interior
node 2 will be initialized. Thus, interior node 5 and exterior
node 3 will be initialized. This process is repeated until all
data values are input.
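
The growth behaviour can be illustrated with a deliberately simplified model. The sketch below assumes a tree whose effective size doubles each time the next exterior node to be filled falls outside the current subtree; it does not reproduce the Loser Value lookup or the node numbering of FIG. 3, and it expands only when the next node is actually needed. The name effective_sizes is hypothetical.

```python
def effective_sizes(record_count: int) -> list[int]:
    """Effective exterior node capacity after each record is loaded.

    Simplified model: the top of the tree moves up one level (capacity
    doubles) whenever the next record would land outside the current
    subtree.
    """
    capacity = 1
    sizes = []
    for loaded in range(1, record_count + 1):
        while loaded > capacity:
            capacity *= 2        # move the top node up one level
        sizes.append(capacity)
    return sizes
```

Under this model, loading five records passes through capacities 1, 2, 4, 4 and 8, so the tree never holds more than about twice as many exterior nodes as records.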

A preferred method for initializing nodes is to use two 
indicators. The first indicator points to the interior node 
above the last exterior node that was filled and the second 
indicator points to the interior node above the next exterior 
node to be filled. If the level of the second indicator is at a 
higher level than the first indicator, the first indicator is 
moved up. If the node pointed to by the second indicator is 
the same as that of the first indicator then the initialization
process stops. Assuming the initialization process continues, the
interior node the second indicator points at is initialized and
both indicators are then moved up to the next level in the 
hierarchy. If the two indicators then point to the same 
interior node, the initialization process stops. On the other 
hand, if the two indicators point to different interior nodes, 
the initialization and comparison process is repeated. At 
some point, the two indicators will point to the same node 
and the initialization process is completed. 
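
The two-indicator walk can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes heap-style numbering (the node above interior node i is interior node i // 2, with interior node 1 at the top level), and it assumes the second indicator never starts at a lower level than the first. The function names are hypothetical.

```python
def depth(node: int) -> int:
    """Depth of an interior node under heap numbering (node 1 has depth 0)."""
    return node.bit_length() - 1


def nodes_to_initialize(first: int, second: int) -> list:
    """Return the interior nodes initialized by the two-indicator walk.

    first  -- interior node above the last exterior node that was filled
    second -- interior node above the next exterior node to be filled
    """
    initialized = []
    # If the second indicator is at a higher level, move the first one up.
    while depth(first) > depth(second):
        first //= 2
    # Initialize the node under the second indicator, then move both up,
    # until the two indicators point to the same interior node.
    while first != second:
        initialized.append(second)
        first //= 2
        second //= 2
    return initialized


# Last filled slot sits below interior node 4, next slot below interior node 5.
result = nodes_to_initialize(4, 5)
```

Here only interior node 5 is initialized before both indicators meet at interior node 2. When the next slot already lies in an initialized branch, for example `nodes_to_initialize(8, 4)`, the walk stops at once and returns an empty list.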

FIG. 9 illustrates the input of the second data item into the 
sort tree depicted in FIG. 8. The uninitialized nodes are 
shown as empty. Specifically, interior nodes 8, 4, 2, 1 and 0 
are initialized along with corresponding exterior nodes 0, 1, 
2, 3, and 5. The value of "aa" is input into exterior node 0, 
and "bb" into exterior node 1. To input a new value, it is
necessary to initialize another branch because adding a new
value will cause a RUN 1 value to be moved above interior
node 4. The root node is currently pointed at by the highest
interior node pointing at a real number, i.e. node number 4. 
The next data item is read into exterior node 2. When 
exterior node 2 is filled, there will be no other space 
available in the initialized branch. 

When all the nodes in the initialized branch are occupied, 
the node that points to the root node is changed and 
initialization occurs. The node above the node pointing at
the root node is set to point at the root node. In FIG. 9, the
node that needs to change corresponds to interior node 2 and
its corresponding exterior node, number 3. FIG. 10 illustrates
this stage. In FIG. 10, the initialized nodes contain
values and the uninitialized nodes are empty. Specifically, the
branch descending from interior node 2 is initialized and this 
node now points at the root node. Also in FIG. 10, a third 
data value has been input. Exterior node 2 is populated with
value "cc" and interior node 4 is initialized. 

In this embodiment, only one compare is needed to 
remove the first RUN 0 initialization value. Specifically, the
RUN 0 initialization value for exterior node 0 is extracted
once the first RUN 1 value is input into exterior node 0. This
is illustrated in FIGS. 8 and 9. In FIG. 8, the RUN 0
initialization value for exterior node 0 (0:0) is in the root 
node 85. In FIG. 9, that value is removed. 

The embodiments of the present invention may also be 
combined. In various data sorts, combining the embodiments
may achieve faster sorts. For example, the third
embodiment may be used in conjunction with the second 
embodiment. Thus, when the last data item is read in and the 
sort tree has been built, all RUN 0 initialization values may 
be changed to RUN 2 loser values as described above.

The first embodiment may be further optimized when
used in conjunction with the second embodiment. For
example, when the last data item is read in and, as described
in the first embodiment, the sort tree is reduced, there still 
may be remaining RUN 0 initialization values in the reduced 
sort tree. In accordance with the second embodiment, all 
remaining RUN 0 initialization values may be changed to 
RUN 2 loser values. 
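
The effect of changing RUN 0 initialization values to RUN 2 loser values can be illustrated with a minimal sketch. It assumes, for illustration only, that each tree entry is compared as a (run number, key) pair and that the lower pair wins a comparison. The tuple layout, constant names, and function name are hypothetical and are not taken from the embodiments.

```python
# Each tree entry is modeled as (run, key); the lower tuple wins a compare.
RUN_INIT, RUN_DATA, RUN_LOSER = 0, 1, 2


def retire_initialization_values(entries):
    """Return a copy in which every RUN 0 placeholder is re-tagged RUN 2."""
    return [(RUN_LOSER, key) if run == RUN_INIT else (run, key)
            for run, key in entries]


# Two real records ("bb", "aa") mixed with two leftover placeholders.
entries = [(RUN_INIT, ""), (RUN_DATA, "bb"), (RUN_INIT, ""), (RUN_DATA, "aa")]
retagged = retire_initialization_values(entries)

winner_before = min(entries)   # a placeholder would win first
winner_after = min(retagged)   # a real record wins first
```

Under these assumptions, a placeholder no longer has to be extracted before the real records, because every re-tagged entry now loses to any RUN 1 value.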

After the data sorts described above are completed, the 
sorted data may be read from the volatile memory 40 to the 
nonvolatile memory 50 in the order of the sorted data values.
Alternatively, as each data value is retrieved from the root 
node, it may be read from the volatile memory 40 to the 
nonvolatile memory 50 in order. 
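
The second alternative, writing each value out as it is retrieved, can be sketched as follows. This is an illustration only: Python's `heapq` binary heap stands in for the tournament sort tree, an in-memory text buffer stands in for the nonvolatile memory 50, and the function name `write_sorted` is hypothetical.

```python
import heapq
import io


def write_sorted(records, sink):
    """Write records to `sink` one at a time, in ascending order.

    A binary heap is used here as a stand-in for the sort tree; each pop
    plays the role of retrieving the current winner from the root node.
    """
    heap = list(records)
    heapq.heapify(heap)
    while heap:
        sink.write(heapq.heappop(heap) + "\n")  # emit as soon as retrieved


sink = io.StringIO()  # stand-in for a file on nonvolatile storage
write_sorted(["cc", "aa", "bb"], sink)
```

Writing each value as it is retrieved avoids holding a second, fully sorted copy of the data in volatile memory before output begins.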

Thus, it is seen that an apparatus and method for dynami-
cally sorting database data is provided. One skilled in the art 
will appreciate that the present invention can be practiced by 
other than the preferred embodiments which are presented 
for purposes of illustration and not of limitation, and the 
present invention is only limited by the claims which follow. 

What is claimed is: 

1. A method of sorting and storing data in a computer 
system, the computer system including a Central Processor 
Unit (CPU), nonvolatile memory accessible by the CPU, and 
working memory associated with the CPU, the nonvolatile 
memory including a plurality of data records stored therein, 
comprising the steps of: 

reading said data records from said nonvolatile memory 
and storing them in said volatile working memory;

assigning a unique data record identifier to each data 
record in said volatile memory; 

creating and initializing a sort tree in said volatile 
memory, said sort tree including a plurality of nodes 
allocated to locations in said volatile memory, said 
nodes including a plurality of exterior nodes, a plurality 
of interior nodes, and a root node; 

initializing said sort tree in combination with entry of said 
data record identifiers into said sort tree so as to add 
nodes to the sort tree in accordance with a number of 
data records added, so that the sort tree is initialized to 
the extent that it is only large enough to hold the data 
records entered; 

sorting said data record identifiers by comparing said data 
record identifiers throughout said sort tree to said root 
node; and

reading said data records from said volatile memory and 
storing them in said nonvolatile memory in the order of 
said sorted record identifiers. 

2. A method as set out in claim 1, wherein said data record 
identifiers include a sorting key indicating a characteristic of 
the record desired to be sorted and a pointer identifying a 
unique volatile memory location for each record and 
wherein said step of sorting comprises sorting based on 
comparing sorting key values. 

3. A method as set out in claim 1 wherein said creating and 
initializing step comprises:

serially initializing exterior nodes and serially introducing 
said data record identifiers into said external nodes as 
the data record identifiers are entered, the first of said 
data record identifiers being introduced into the first 
created external node and subsequent data record iden- 
tifiers being associated with consecutive created exter- 
nal nodes; 

initializing internal nodes directly above said exterior 
nodes; 

associating initialization values with said internal nodes; 
and 
assigning a node to point to the root node which is
redefined as exterior nodes are added. 

4. A method as set out in claim 1, wherein said initialized 
sort tree height is determined by said initializing step. 

5. A computer program product comprising: a computer
usable medium having computer readable code embodied 
therein for causing sorting of data records, the computer 
program product comprising: 

computer readable program code devices configured to 
cause a computer to effect reading said data records 
from a nonvolatile memory and storing them in
said volatile working memory; 

computer readable program code devices configured to 
cause a computer to effect assigning a unique data 
record identifier to each data record in said volatile 
memory; 

computer readable program code devices configured to 
cause a computer to effect creating a sort tree in said 
volatile memory, said sort tree including a plurality of 
nodes allocated to locations in said volatile memory, 
said nodes including a plurality of exterior nodes, a 
plurality of interior nodes, and a root node, said nodes
being initialized in combination with entry of said data
record identifiers into said sort tree so as to add nodes 
to the sort tree as data records are added; 

computer readable program code devices configured to 
cause a computer to effect sorting said data record 
identifiers by comparing said data record identifiers 
through said sort tree to said root node; and 

computer readable program code devices configured to 
cause a computer to effect reading said data records 
from said volatile memory and storing them in said 
nonvolatile memory in the order of said sorted record 
identifiers. 

6. A computer program product as set out in claim 5, 
wherein said code is formed thereon as magnetic flux
reversals. 

7. A computer program product as set out in claim 5, 
wherein said code is formed thereon as changes in optical 
reflectivity of the medium. 